Hash Joins and Hash Teams in Microsoft SQL Server

نویسندگان

  • Goetz Graefe
  • Ross Bunker
  • Shaun Cooper
چکیده

The query execution engine in Microsoft SQL Server employs hash-based algorithms for inner and outer joins, semi-joins, set operations (such as intersection), grouping, and duplicate removal. The implementation combines many techniques proposed individually in the research literature but never combined in a single implementation, neither in a product nor in a research prototype. One of the paper’s contributions is a design that cleanly integrates most existing techniques. One technique, however, which we call hash teams and which has previously been described only in vague terms, has not been implemented in prior research or product work. It realizes in hash-based query processing many of the benefits of interesting orderings in sort-based query processing. Moreover, we describe how memory is managed in complex and bushy query evaluation plans with multiple sort and hash operations. Finally, we report on the effectiveness of hashing using two very typical database queries, including the performance effects of hash teams.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Value of Merge-Join and Hash-Join in SQL Server

Microsoft SQL Server was successful for many years for transaction processing and decision support workloads with neither merge join nor hash join, relying entirely on nested loops and index nested loops join. How much difference do additional join algorithms really make, and how much system performance do they actually add? In a pure OLTP workload that requires only record-to-record navigation...

متن کامل

An Adaptive Hash Join Algorithm for Multiuser Environments

As main memory becomes a cheaper resource, hash joins are an alternative to the traditional methods of performing equi-joins: nested loop and merge joins. This paper introduces a modified, adaptive hash join method that is designed to work with dynamic changes in the amount of available memory. The general idea of the algorithm is to regulate resource usage of a hash join in a way that allows i...

متن کامل

On a Three-Way Hash Join Algorithm

We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Comparing to two-cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...

متن کامل

Adapting Hash Joins For Modern Processors

Hash join algorithms are crucial to the performance of modern database systems. Conventional hash joins exhibit poor memory system performance on modern processors because their key data structure, the bucket-chain hash table, is ill-suited for the performance characteristics of out-of-order processors with large cache hierarchies. Whereas prior research has considered a variety of optimization...

متن کامل

Memory-Efficient Hash Joins

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998